Identifying diachronic change in semantic relations
Abstract
test the hypothesis that words which are similar in meaning occur in similar contexts, a corollary of the initial hypothesis that the meaning of a word is defined by the collocational company it keeps. This similarity is measured by creating 'collocational profiles' (summaries of the words typically occurring in the context of a given word) for each unique word form (type) in a large corpus of newspaper data. By comparing the profile of the target word, the 'headword' in thesaurus or dictionary terms, with the profiles of the other words in the corpus, a set of words can be established which are semantically linked to the target word; these are referred to as 'nyms', since they could be in a traditional thesaural relationship with each other, for instance as synonyms or antonyms, or may stand in a previously undefined relationship to the target word. The findings from ACRONYM will help to overcome several of the difficulties hitherto involved in the creation of both printed and electronic thesauri. One of these, which forms the topic of this paper, is the problem of keeping the thesaurus up to date with the evolution of the language and with changes in real-world referents (the incumbents of government posts, for example). This issue affects the thesaural facilities incorporated in many software applications, such as word processors, but is particularly problematic in search engines accessing databases whose material is regularly updated. An example would be the textual databases held by Financial Times Electronic Publishing, one of the industrial partners of ACRONYM, which provides online search access to past issues of national daily newspapers going back several years. It is not hard to imagine how quickly the semantic equivalents described in a thesaurus in such a system can become outdated.
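The profile comparison described above can be illustrated with a minimal sketch. This excerpt does not specify how ACRONYM builds or compares profiles, so everything here is an assumption made for illustration: the symmetric context window of two tokens, raw co-occurrence counts as the profile, cosine similarity as the comparison measure, and the function names `build_profiles`, `cosine` and `nyms`.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_profiles(sentences, window=2):
    """Map each word type to counts of the words seen within `window` tokens of it.

    The window size and the use of raw counts are illustrative assumptions.
    """
    profiles = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    profiles[word][tokens[j]] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity between two collocational profiles (assumed measure)."""
    dot = sum(count * b[w] for w, count in a.items() if w in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nyms(target, profiles, top_n=5):
    """Rank the other word types by profile similarity to the target word."""
    target_profile = profiles.get(target, Counter())
    scores = [
        (other, cosine(target_profile, profile))
        for other, profile in profiles.items()
        if other != target
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]
```

Under these assumptions, one way to approach the up-to-dateness problem would be to build profiles separately for successive time slices of the corpus and compare the ranked nym lists for the same target word across slices.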
In order to achieve the greatest possible accuracy in the identification of nyms, as much data as is available is generally used in deriving the collocational profiles. The ACRONYM corpus database system was designed in such a way,
Similar resources
A State-of-the-Art of Semantic Change Computation
This paper reviews the state of the art in one emerging field of computational linguistics, semantic change computation, proposing a framework that summarizes the literature by identifying and expounding five essential components in the field: diachronic corpus, diachronic word sense characterization, change modelling, evaluation data and data visualization. Despite the potential of the field, the...
Recent Developments in Spanish (and Romance) Historical Semantics
Diachronic semantics has long been the stepchild of Spanish (and Romance) historical linguistics. Although many studies have examined (often in searching detail) the semantic evolution of individual lexical items, Hispanists have ignored broader patterns of semantic change and the relevant theoretical and methodological issues posed by this phenomenon. Working within the framework of cognitive ...
Identifying Diachronic Topic-Based Research Communities by Clustering Shared Research Trajectories
Communities of academic authors are usually identified by means of standard community detection algorithms, which exploit 'static' relations, such as co-authorship or citation networks. In contrast with these approaches, here we focus on diachronic topic-based communities, i.e., communities of people who appear to work on semantically related topics at the same time. These communities are inter...
Verbs Change More than Nouns: a Bottom-up Computational Approach to Semantic Change
Linguists have identified a number of types of recurrent semantic change, and have proposed a number of explanations, usually based on specific lexical items. This paper takes a different approach, by using a distributional semantic model to identify and quantify semantic change across an entire lexicon in a completely bottom-up fashion, and by examining which distributional properties of words...
Diachronic semantic cohesion for topic segmentation of TV broadcast news
This paper proposes a new way to integrate semantic relations into a topic segmentation process by defining the notion of semantic cohesion. In the context of a sliding-window-based automatic topic segmentation algorithm, semantic relations are incorporated in the similarity measure between adjacent blocks. Additionally, in the context of TV Broadcast News topic segmentation, we propose a new prot...
Simulation of Language Evolution based on Actual Diachronic Change Extracted from Legal Terminology
Simulation studies have played an important role in language evolution. Although a variety of methodologies have been proposed so far, they are typically too abstract to recognize that their learning mechanisms properly reflect actual ones. One reason comes from the lack of empirical data recorded for a long period with explicit description. Our purpose in this paper is to show simulation model...